A New Czech Morphological Analyser ajka
Abstract
This paper deals with the efficient implementation of the new Czech morphological analyser ajka, which is based on an algorithmic description of Czech formal morphology. First, we present the two most important word-forming processes in Czech: inflection and derivation. We also include a brief description of the data structures used for storing morphological information and a discussion of the efficient storage of lexical items (stem bases of Czech words). Finally, we present some interesting features of the designed and implemented system ajka, together with current statistical data.
Similar resources
Fast Morphological Analysis of Czech
This paper presents a new Czech morphological analyser which takes advantage of Jan Daciuk’s algorithms for minimal deterministic acyclic finite state automata. The new analyser is six times faster than the current analyser ajka at analysis proper, i.e. returning the possible lemmata and tags for a given word form, and for some other related tasks the difference is even larger.
Relations between Inflectional and Derivation Patterns
One of the main goals of this paper is to describe a formal procedure linking inflectional and derivational processes in Czech and to indicate that, if appropriate tools and resources are used, they can be applied to other Slavonic languages. The tools developed at the NLP Laboratory FI MU have been used, particularly the morphological analyser ajka and the program I par for processing and mai...
From Czech Morphology through Partial Parsing to Disambiguation
This paper deals with a complex system for processing raw Czech texts. Several modules were implemented which perform different levels of processing. These modules can easily be incorporated into many other linguistic applications, and some of them are already used in this way. The first level of processing raw texts is a reliable morphological analysis – we give a survey of the effe...
Enriching WordNet with Derivational Subnets
In this paper, we deal with derivational (word formation) relations as they are handled by the Czech morphological module Ajka. First, we show that they represent empirically well-founded semantic relations forming small semantic networks, and then we address the problem of how to integrate them into a lexical database such as (Czech) WordNet. In this respect we examine the relation between the deri...
Towards Czech Morphological Guesser
This paper presents a morphological guesser for Czech based on data from the Czech morphological analyzer ajka [1]. The idea behind the presented concept rests on the assumption that new words in a language (which are therefore unknown to the analyzer) behave quite regularly, and that a description of this regular behaviour can be extracted from the existing data of the morphological analyzer. The paper...